Quantitative approaches to diachronic corpus linguistics

Authors

  • Martin Hilpert
  • Stefan Th. Gries
Abstract

English Historical Linguistics has a rich and long-standing tradition of corpus-based work (cf. the surveys in Rissanen 2008, Kytö 2012). Resources such as the HELSINKI corpus, the BROWN family of corpora, and ARCHER have spawned active research programs for the study of lexical and grammatical change, both long-term (Curzan 2008) and short-term (Mair 2008). In addition, corpus resources inform the analysis of diachronic variation in genres (Mair & Hundt 1999), registers (Biber & Gray 2011), and varieties (Tagliamonte 2006a). The present chapter will discuss a currently developing line of research which uses the methods of quantitative corpus linguistics for the analysis of diachronic corpora. This research program draws on, and is informed by, the aforementioned areas, but at the same time, it uses particular kinds of data and handles that data in specific ways that merit discussion. Diachronic corpora are understood here as textual resources that represent comparable types of language use over sequential periods of time, thus comprising at least two periods, as in the Diachronic Corpus of Present-Day Spoken English (DCPSE, Wallis et al. 2006), but typically many more, as in the Corpus of Historical American English (COHA, Davies 2010), a monitor corpus which at the time of writing samples 21 sequential decades of language use. The English diachronic corpora that are currently available represent different varieties and text types and vary in their respective time depths, but it is a design feature of most diachronic corpora to hold the type of text constant, so that diachronic language change within a given text type may be studied with as few confounding factors as possible. Quantitative corpus linguistics (Biber & Jones 2008) is a research tradition in which research questions are formulated in such a way that frequency counts from corpora may provide answers. 
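As a rough illustration of how frequency counts from sequential corpus periods can bear on a research question, the following minimal sketch counts one word form per period and normalizes the count by period size. The toy sentences, the period labels, and the helper name `per_period_counts` are assumptions made up for this illustration; they are not taken from any of the corpora named above.

```python
from collections import Counter

# Hypothetical toy data: period label -> list of tokens.
# These sentences are invented for illustration, not drawn from a real corpus.
toy_corpus = {
    "1960s": "she said that he was like a brother and she said no".split(),
    "1990s": "she was like no way and he was like whatever and I was like fine".split(),
}


def per_period_counts(corpus, target):
    """Summarize each period: corpus size, type count, and raw and
    normalized (per 1,000 tokens) frequency of the target word form."""
    rows = {}
    for period, tokens in corpus.items():
        counts = Counter(tokens)
        size = len(tokens)
        rows[period] = {
            "tokens": size,                    # token count of the period
            "types": len(counts),              # number of distinct word forms
            "raw": counts[target],             # raw token frequency of target
            "per_1000": 1000 * counts[target] / size,  # normalized frequency
        }
    return rows


summary = per_period_counts(toy_corpus, "like")
for period, row in summary.items():
    print(period, row)
```

Normalizing by period size matters because corpus periods rarely contain the same number of words, so raw counts alone are not comparable across periods.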
Quantitative corpus work thus often engages in hypothesis testing, so that a testable empirical question (e.g. 'Have adolescent women been leading the development of the quotative be like in Tyneside English?') may receive an answer of either 'yes' or 'no'. Of at least equal importance are so-called exploratory techniques, which are designed to transform a complex dataset into a summary (and often visual) representation that the analyst may then interpret and that may in turn lead to the formulation of hypotheses. To give an example, Szmrecsanyi (2010) studies the use of genitive constructions in different text types of British and American English in the 1960s and the 1990s, exploring whether there are changes that could be seen as Americanization or colloquialization (cf. Mair 2006). The frequency counts that enter quantitative corpus studies often represent token frequencies, but a much wider variety of measures is routinely used, including measures of type frequency, dispersion, and collocation. The main aim of this chapter is to give an overview of how the two, diachronic corpora and quantitative corpus linguistics, are put together in fruitful ways. Quantitative studies of how units of linguistic structure change across corpus periods can address questions of more general linguistic interest, including the following ones:

Similar resources

Gearing the Discursive Practice to the Evolution of Discipline: Diachronic Corpus Analysis of Stance Markers in Research Articles’ Methodology Section

Despite widespread interest and research among applied linguists to explore metadiscourse use, very little is known of how metadiscourse resources have evolved over time in response to the historically developing practices of academic communities. Motivated by such an ambition, the current research drew on a corpus of 874315 words taken from three leading journals of applied linguistics in orde...

Investigating Lexico-grammaticality in Academic Abstracts and Their Full Research Papers from a Diachronic Perspective

Development of science and academic knowledge has led to changes in academic language and transfer of information and knowledge. In this regard, the present study is an attempt to investigate lexico-grammaticality in academic abstracts and their full research papers in Linguistics, Chemistry and Electrical engineering papers published during 1991-2015 in academic journals from a diachronic pers...

A State-of-the-Art of Semantic Change Computation

This paper reviews state-of-the-art of one emerging field in computational linguistics — semantic change computation, proposing a framework that summarizes the literature by identifying and expounding five essential components in the field: diachronic corpus, diachronic word sense characterization, change modelling, evaluation data and data visualization. Despite the potential of the field, the...

The Changing Face of Corpus Linguistics

This volume is witness to a spirited and fruitful period in the evolution of corpus linguistics. In twenty-two articles written by established corpus linguists, members of the ICAME (International Computer Archive of Modern and Mediaeval English) association, this new volume brings the reader up to date with the cycle of activities which make up this field of study as...

The co-evolution of syntactic and pragmatic complexity: diachronic and cross-linguistic aspects of pseudoclefts

This chapter examines the diachronic rise of a syntactically and pragmatically complex construction type: pseudoclefts. Given that cleft constructions combine available components of grammar – relative clauses and copular clauses – do they arise in full-fledged form? If they emerge gradually, what constrains their development? We first present a corpus-based analysis of the history of English p...

Publication date: 2013